AITopics | sparse region

sparse region

Isolation-based Spherical Ensemble Representations for Anomaly Detection

Cao, Yang, Yang, Sikun, Tian, Hao, He, Kai, Qi, Lianyong, Liu, Ming, Yang, Yujiu

arXiv.org Artificial Intelligence | Oct-16-2025

Anomaly detection is a critical task in data mining and management with applications spanning fraud detection, network security, and log monitoring. Despite extensive research, existing unsupervised anomaly detection methods still face fundamental challenges including conflicting distributional assumptions, computational inefficiency, and difficulty handling different anomaly types. To address these problems, we propose ISER (Isolation-based Spherical Ensemble Representations) that extends existing isolation-based methods by using hypersphere radii as proxies for local density characteristics while maintaining linear time and constant space complexity. ISER constructs ensemble representations where hypersphere radii encode density information: smaller radii indicate dense regions while larger radii correspond to sparse areas. We introduce a novel similarity-based scoring method that measures pattern consistency by comparing ensemble representations against a theoretical anomaly reference pattern. Additionally, we enhance the performance of Isolation Forest by using ISER and adapting the scoring function to address axis-parallel bias and local anomaly detection limitations. Comprehensive experiments on 22 real-world datasets demonstrate ISER's superior performance over 11 baseline methods. Anomaly detection is the task of identifying data points that deviate significantly from the majority of observations, with applications in fraud detection, network security, and quality control (Chandola et al., 2009; Liu et al., 2024; Tang et al., 2024; Song et al., 2023). Despite extensive research, developing effective unsupervised anomaly detection methods remains challenging due to several fundamental limitations. Existing methods face a critical trade-off between computational efficiency and handling varying local densities. Density-based methods like Local Outlier Factor (Breunig et al., 2000) address this but require quadratic time complexity, limiting scalability.

data mining, hypersphere, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2510.13311

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Multi-Policy Pareto Front Tracking Based Online and Offline Multi-Objective Reinforcement Learning

Zhao, Zeyu, Che, Yueling, Liu, Kaichen, Li, Jian, Yao, Junmei

arXiv.org Artificial Intelligence | Aug-5-2025

Multi-objective reinforcement learning (MORL) plays a pivotal role in addressing multi-criteria decision-making problems in the real world. Multi-policy (MP) based methods are widely used to obtain high-quality Pareto front approximations for MORL problems. However, traditional MP methods rely only on online reinforcement learning (RL) and adopt an evolutionary framework with a large policy population. This may lead to sample inefficiency and/or excessive agent-environment interactions in practice. By forsaking the evolutionary framework, we propose the novel Multi-policy Pareto Front Tracking (MPFT) framework without maintaining any policy population, where both online and offline MORL algorithms can be applied. The proposed MPFT framework includes four stages: Stage 1 approximates all the Pareto-vertex policies, whose mappings to the objective space fall on the vertices of the Pareto front. Stage 2 designs the new Pareto tracking mechanism to track the Pareto front, starting from each of the Pareto-vertex policies. Stage 3 identifies the sparse regions in the tracked Pareto front, and introduces a new objective weight adjustment method to fill the sparse regions. Finally, by combining all the policies tracked in Stages 2 and 3, Stage 4 approximates the Pareto front. Experiments are conducted on seven different continuous-action robotic control tasks with both online and offline MORL algorithms, and demonstrate the superior hypervolume performance of our proposed MPFT approach over the state-of-the-art benchmarks, with significantly reduced agent-environment interactions and hardware requirements.
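The idea behind Stage 3 can be illustrated with a minimal sketch. The abstract does not say how sparse regions are identified, so the gap-based criterion below, the function name `sparse_gaps`, and the `factor` parameter are all assumptions made for illustration and should not be read as the MPFT procedure. The sketch orders a two-objective front by the first objective and flags neighbouring pairs whose spacing is well above the average spacing.

```python
import math

def sparse_gaps(front, factor=1.5):
    """Return neighbouring point pairs whose spacing exceeds factor * mean spacing.

    `front` is a list of (objective_1, objective_2) tuples. The gap rule and
    `factor` are hypothetical choices, not taken from the MPFT paper.
    """
    ordered = sorted(front)  # order along the first objective
    gaps = [math.dist(a, b) for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return []
    mean_gap = sum(gaps) / len(gaps)
    return [(a, b) for a, b, g in zip(ordered, ordered[1:], gaps)
            if g > factor * mean_gap]

# Toy front with one visibly under-covered stretch between (2, 2) and (6, 1).
front = [(0.0, 4.0), (1.0, 3.0), (2.0, 2.0), (6.0, 1.0), (7.0, 0.0)]
print(sparse_gaps(front))  # [((2.0, 2.0), (6.0, 1.0))]
```

A flagged pair marks the stretch of the front where additional policies would be most useful; how objective weights are then adjusted to fill it is specific to the paper and is not reproduced here.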

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2508.02217

Country: Asia (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Evaluating Detection Thresholds: The Impact of False Positives and Negatives on Super-Resolution Ultrasound Localization Microscopy

Gharamaleki, Sepideh K., Helfield, Brandon, Rivaz, Hassan

arXiv.org Artificial Intelligence | Nov-11-2024

Super-resolution ultrasound imaging with ultrasound localization microscopy (ULM) offers a high-resolution view of microvascular structures. Yet, ULM image quality heavily relies on precise microbubble (MB) detection. Despite the crucial role of localization algorithms, there has been limited focus on the practical pitfalls in MB detection tasks such as setting the detection threshold. This study examines how False Positives (FPs) and False Negatives (FNs) affect ULM image quality by systematically adding controlled detection errors to simulated data. Results indicate that while both FP and FN rates impact Peak Signal-to-Noise Ratio (PSNR) similarly, increasing FP rates from 0% to 20% decreases Structural Similarity Index (SSIM) by 7%, whereas the same FN rates cause a greater drop of around 45%. Moreover, dense MB regions are more resilient to detection errors, while sparse regions show high sensitivity, showcasing the need for robust MB detection frameworks to enhance super-resolution imaging.

artificial intelligence, imaging, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.07426

Country: North America > Canada > Quebec > Montreal (0.05)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine > Therapeutic Area (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Heavy-Tailed Process Priors for Selective Shrinkage

Neural Information Processing Systems | Apr-6-2023, 13:32:26 GMT

Heavy-tailed distributions are often used to enhance the robustness of regression and classification methods to outliers in output space. Often, however, we are confronted with "outliers" in input space, which are isolated observations in sparsely populated regions. We show that heavy-tailed process priors (which we construct from Gaussian processes via a copula), can be used to improve robustness of regression and classification estimators to such outliers by selectively shrinking them more strongly in sparse regions than in dense regions. We carry out a theoretical analysis to show that selective shrinkage occurs provided the marginals of the heavy-tailed process have sufficiently heavy tails. The analysis is complemented by experiments on biological data which indicate significant improvements of estimates in sparse regions while producing competitive results in dense regions.

outlier, selective shrinkage, sparse region, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)

In Defense of Core-set: A Density-aware Core-set Selection for Active Learning

Kim, Yeachan, Shin, Bonggun

arXiv.org Artificial Intelligence | Jul-14-2022

Active learning enables the efficient construction of a labeled dataset by labeling informative samples from an unlabeled dataset. In a real-world active learning scenario, considering the diversity of the selected samples is crucial because many redundant or highly similar samples exist. The core-set approach is a promising diversity-based method that selects diverse samples based on the distance between samples. However, the approach performs poorly compared to the uncertainty-based approaches that select the most difficult samples, where neural models reveal low confidence. In this work, we analyze the feature space through the lens of density and, interestingly, observe that locally sparse regions tend to have more informative samples than dense regions. Motivated by our analysis, we empower the core-set approach with density-awareness and propose a density-aware core-set (DACS). The strategy is to estimate the density of the unlabeled samples and select diverse samples mainly from sparse regions. To reduce the computational bottlenecks in estimating the density, we also introduce a new density approximation based on locality-sensitive hashing. Experimental results clearly demonstrate the efficacy of DACS in both classification and regression tasks and specifically show that DACS can produce state-of-the-art performance in a practical scenario. Since DACS is weakly dependent on neural architectures, we present a simple yet effective combination method to show that the existing methods can be beneficially combined with DACS.

artificial intelligence, machine learning, sparse region, (14 more...)

arXiv.org Artificial Intelligence

2206.04838

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Heavy-Tailed Process Priors for Selective Shrinkage

Wauthier, Fabian L., Jordan, Michael I.

Neural Information Processing Systems | Feb-15-2020, 03:43:58 GMT

outlier, selective shrinkage, sparse region, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

A Domain Adaptive Density Clustering Algorithm for Data with Varying Density Distribution

Chen, Jianguo, Yu, Philip S.

arXiv.org Machine Learning | Nov-22-2019

As one type of efficient unsupervised learning method, clustering algorithms have been widely used in data mining and knowledge discovery with noticeable advantages. However, clustering algorithms based on density peak have limited clustering effect on data with varying density distribution (VDD), equilibrium distribution (ED), and multiple domain-density maximums (MDDM), leading to the problems of sparse cluster loss and cluster fragmentation. To address these problems, we propose a Domain-Adaptive Density Clustering (DADC) algorithm, which consists of three steps: domain-adaptive density measurement, cluster center self-identification, and cluster self-ensemble. For data with VDD features, clusters in sparse regions are often neglected by using uniform density peak thresholds, which results in the loss of sparse clusters. We treat each data point and its KNN neighborhood as a subgroup to better reflect its density distribution in a domain view. In addition, for data with ED or MDDM features, a large number of density peaks with similar values can be identified, which results in cluster fragmentation. We propose a cluster center self-identification and cluster self-ensemble method to automatically extract the initial cluster centers and merge the fragmented clusters. Experimental results demonstrate that compared with other comparative algorithms, the proposed DADC algorithm can obtain more reasonable clustering results on data with VDD, ED and MDDM features. Benefitting from few parameter requirements and a non-iterative nature, DADC achieves low computational complexity and is suitable for large-scale data clustering. Numerous clustering algorithms have been proposed, including the partitioning-based, hierarchical-based, density-based, grid-based, model-based, and density-peak-based methods [3-6].
Among them, density-based methods (e.g., DBSCAN, CLIQUE, and OPTICS) can effectively discover clusters of arbitrary shape using the density connectivity of clusters, and do not require a predefined number of clusters [6]. In recent years, Density-Peak-based Clustering (DPC) algorithms, as a branch of density-based clustering, were introduced in [7, 8], assuming that the cluster centers are surrounded by low-density neighbors and can be detected by efficiently searching for local density peaks. Benefitting from few parameter requirements and a non-iterative nature, DPC algorithms can efficiently detect clusters of arbitrary shape from large-scale datasets with low computational complexity. However, as shown in Figure 1, DPC algorithms have limited clustering effect on data with varying density distribution (VDD), multiple domain-density maximums (MDDM), or equilibrium distribution (ED).
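The sparse cluster loss described above can be reproduced with a small sketch of density-peak scoring. It assumes one commonly used formulation (local density as the number of neighbours within a cutoff distance, plus the distance to the nearest point of higher density); this is not the domain-adaptive measure that DADC proposes, and the function name `density_peak_scores` and the `cutoff` parameter are illustrative choices.

```python
import math

def density_peak_scores(points, cutoff):
    """Return (rho, delta) for each point.

    rho[i]   = number of other points closer than `cutoff` (local density).
    delta[i] = distance to the nearest point with strictly higher rho, or the
               largest distance from i when no denser point exists.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# A tight cluster near the origin and a loose cluster near (10, 10).
dense = [(0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
sparse = [(10.0, 10.0), (11.0, 10.0), (9.0, 10.0), (10.0, 11.0), (10.0, 9.0)]
rho, delta = density_peak_scores(dense + sparse, cutoff=0.15)
print(rho)  # [4, 3, 3, 3, 3, 0, 0, 0, 0, 0]
```

With a single cutoff tuned to the tight cluster, every point of the loose cluster has density 0, so any uniform density threshold for picking centers would discard that cluster entirely. This is the failure mode that motivates measuring density relative to each point's own neighbourhood.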

algorithm, dataset, domain density, (14 more...)

arXiv.org Machine Learning

doi: 10.1109/TKDE.2019.2954133

1911.10293

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Nearest-Neighbour-Induced Isolation Similarity and its Impact on Density-Based Clustering

Qin, Xiaoyu, Ting, Kai Ming, Zhu, Ye, Lee, Vincent CS

arXiv.org Machine Learning | Jun-30-2019

A recent proposal of data dependent similarity called Isolation Kernel/Similarity has enabled SVM to produce better classification accuracy. We identify shortcomings of using a tree method to implement Isolation Similarity; and propose a nearest neighbour method instead. We formally prove the characteristic of Isolation Similarity with the use of the proposed method. The impact of Isolation Similarity on density-based clustering is studied here. We show for the first time that the clustering performance of the classic density-based clustering algorithm DBSCAN can be significantly uplifted to surpass that of the recent density-peak clustering algorithm DP. This is achieved by simply replacing the distance measure with the proposed nearest-neighbour-induced Isolation Similarity in DBSCAN, leaving the rest of the procedure unchanged. A new type of clusters called mass-connected clusters is formally defined. We show that DBSCAN, which detects density-connected clusters, becomes one which detects mass-connected clusters, when the distance measure is replaced with the proposed similarity. We also provide the condition under which mass-connected clusters can be detected, while density-connected clusters cannot.

dataset, dbscan, isolation similarity, (14 more...)

arXiv.org Machine Learning

1907.00378

Country:

Oceania > Australia > Victoria (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Kapil Sharma

#artificialintelligence | Aug-31-2018, 23:28:05 GMT

I previously wrote a post about Kernel Smoothing and how it can be used to fit a non-linear function non-parametrically. In this post, I will extend that idea and try to mitigate the disadvantages of kernel smoothing using Local Linear Regression. I generated some data in my previous post and I will reuse the same data for this post. The data was generated from the function $y = f(x) = \sin(4x) + 2$ with some Gaussian noise. As I mentioned in the previous article, in kernel smoothing, out-of-sample predictions on the edges and in sparse regions can have significant errors and bias. In Local Linear Regression, we try to reduce this bias to first order by fitting straight lines instead of local constants.
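To make the contrast concrete, here is a minimal sketch of both estimators. It is not the post's own code: the Gaussian kernel, the function names, and the `bandwidth` parameter are assumptions chosen for illustration. The local linear estimate solves a weighted least-squares line centred at the query point and returns its intercept.

```python
import math

def gaussian_kernel(u):
    """Unnormalised Gaussian weight; the normalising constant cancels out."""
    return math.exp(-0.5 * u * u)

def kernel_smooth(x0, xs, ys, bandwidth):
    """Kernel smoothing: a locally weighted constant (weighted mean of ys)."""
    w = [gaussian_kernel((x - x0) / bandwidth) for x in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

def local_linear(x0, xs, ys, bandwidth):
    """Fit a weighted straight line around x0 and evaluate it at x0."""
    w = [gaussian_kernel((x - x0) / bandwidth) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    det = s0 * s2 - s1 * s1
    if abs(det) < 1e-12:
        return t0 / s0  # degenerate case: fall back to the local constant
    # Intercept of the weighted least-squares line in coordinates centred at x0.
    return (s2 * t0 - s1 * t1) / det

# Noise-free linear data makes the boundary bias easy to see at x0 = 0.
xs = [i / 10 for i in range(11)]
ys = [3 * x + 1 for x in xs]
print(kernel_smooth(0.0, xs, ys, 0.2))  # noticeably above the true value 1
print(local_linear(0.0, xs, ys, 0.2))   # recovers 1 up to rounding error
```

At the left edge all neighbours lie to one side of the query point, so the weighted mean is pulled towards their larger values, while the fitted line extrapolates back to the correct value. On exactly linear data the local linear estimate has no bias at all, which is what "reducing bias to first order" means in practice.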

artificial intelligence, machine learning, mathbf, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Heavy-Tailed Process Priors for Selective Shrinkage

Wauthier, Fabian L., Jordan, Michael I.

Neural Information Processing Systems | Dec-31-2010

Heavy-tailed distributions are often used to enhance the robustness of regression and classification methods to outliers in output space. Often, however, we are confronted with "outliers" in input space, which are isolated observations in sparsely populated regions. We show that heavy-tailed process priors (which we construct from Gaussian processes via a copula), can be used to improve robustness of regression and classification estimators to such outliers by selectively shrinking them more strongly in sparse regions than in dense regions. We carry out a theoretical analysis to show that selective shrinkage occurs provided the marginals of the heavy-tailed process have sufficiently heavy tails. The analysis is complemented by experiments on biological data which indicate significant improvements of estimates in sparse regions while producing competitive results in dense regions.

artificial intelligence, machine learning, shrinkage, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
